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Diagnosability  of  Stochastic  Chemical  Kinetic  Systems:  A  Discrete  Event 

Systems  Approach 

David  Thorsley 


Abstract — We  consider  the  problem  of  detecting  events  of 
interest  in  a  stochastic  chemical  kinetic  system  from  the 
perspective  of  discrete-event  systems  theory.  We  define  a  class  of 
discrete-event  systems,  timed  stochastic  automata,  that  is  well- 
suited  for  modeling  stochastic  chemical  kinetics  and  define  tA— 
and  t  A  A— diagnosability,  two  appropriate  notions  of  diagnos¬ 
ability  for  this  class  of  system.  We  develop  the  construction 
of  a  timed  stochastic  diagnoser  that  is  used  to  provide  online 
updates  of  the  probability  that  an  event  of  interest  has  occurred 
and  a  means  for  offline  testing  of  diagnosability  conditions.  The 
results  of  the  paper  are  illustrated  using  a  model  of  stochastic 
gene  expression. 

1.  Introduction 

In  the  mass  action  formulation  of  chemical  kinetics,  the 
evolution  of  the  state  in  a  reaction  chamber  is  a  deterministic 
process  governed  by  a  set  of  non-linear  differential  equations. 
This  classical  formulation  breaks  down  when  the  number  of 
molecules  of  any  of  the  reactants  in  the  chamber  is  low, 
as  is  often  the  case  with  intracellular  processes  [1],  [2];  in 
this  situation,  fluctuations  from  the  mean  cellular  behavior 
play  an  important  role.  The  effects  of  noise  and  variability 
inside  the  cell  have  been  observed  experimentally  [3]  and 
quantitatively  analyzed  [4],  [5]. 

In  the  standard  stochastic  formulation  of  chemical  kinetics 
[6],  [7],  the  system  evolves  dynamically  as  a  discrete  event 
process.  The  state  of  the  system  is  deflned  as  a  vector  where 
each  element  is  the  number  of  molecules  of  each  species  in 
the  chamber,  and  events  that  transition  the  system  between 
states  are  the  brings  of  reaction  channels  that  correspond  to 
random  intermolecular  collisions. 

To  analyze  stochastic  chemical  kinetic  systems,  we  model 
the  process  as  a  stochastic  discrete-event  system  in  the 
Ramadge-Wonham  framework  [8].  This  class  of  model  has 
been  used  extensively  in  the  study  of  applications  such  as 
communication  systems,  software  veriflcation,  and  industrial 
processes.  Recently,  these  models  have  been  considered  in 
biological  applications  such  as  the  consistency  in  metabolic 
network  representations  [9] . 

In  this  paper,  we  consider  a  problem  caused  by  the  fact 
that  we  are  technologically  limited  in  our  observation  of 
intracellular  processes.  The  dynamic  behavior  of  a  single  cell 
can  be  observed  experimentally  at  multiple  time  points  by 
taking  time-lapse  movies  [10].  In  these  experiments,  only 
a  few  species  of  interest  can  be  measured  as  there  are  not 
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enough  non-interfering  reporters  to  measure  more  than  two 
or  three  species  at  a  time.  There  may  be  many  events  of 
interest  in  the  behavior  of  the  system  that  cannot  be  directly 
observed,  such  as  the  population  of  a  chemical  species  going 
above  or  below  a  certain  threshold,  a  genetic  switch  (e.g. 
[11])  turning  on  or  off,  a  cell  differentiating  itself  into  one 
of  multiple  phenotypes  [12],  and  so  forth.  In  order  to  draw 
conclusions  about  the  dynamic  behavior  of  unobservable 
species  and  the  occurrence  of  unobservable  events  of  interest, 
we  develop  in  this  paper  the  concept  of  diagnosability  for 
stochastic  chemical  kinetic  systems  based  on  discrete-event 
system  theory. 

In  developing  these  concepts,  we  exploit  the  fact  that 
under  the  standard  assumptions  of  stochastic  chemical  ki¬ 
netics,  the  interarrival  times  between  events  are  distributed 
exponentially  because  the  reaction  chamber  is  modeled  as  a 
continuous-time  Markov  process  [13].  The  knowledge  of  the 
distribution  of  interarrival  times  allows  for  the  extension  of 
the  results  on  diagnosability  of  untimed  stochastic  discrete- 
event  systems  of  [14].  While  prior  studies  of  diagnosis  in 
discrete-event  systems  exploit  timing  information  in  order  to 
facilitate  diagnoses  [15],  [16],  [17],  the  models  considered  in 
these  studies  are  not  probabilistic  and  thus  cannot  assume, 
as  we  do  in  this  paper,  that  the  interarrival  times  between 
events  are  distributed  exponentially.  Furthermore,  the  cited 
papers  do  not  propose  conditions  for  diagnosability  of  a 
timed  discrete  event-system.  On  the  other  hand,  methods  of 
discrete-event  diagnostics  that  do  incorporate  probabilistic 
information,  such  as  [18],  [19],  do  not  incorporate  continuous 
timing  information  into  the  discrete-event  model. 

In  this  paper  we  build  upon  the  results  of  [14]  to  make 
the  following  contributions.  We  propose  a  model  of  a  timed 
stochastic  automaton  appropriate  for  describing  stochastic 
chemical  kinetic  systems.  We  deflne  two  notions  of  diag¬ 
nosability,  tA-  and  t AA-diagnosability  for  this  class  of  timed 
stochastic  automata.  Expanding  upon  the  approach  developed 
in  [14],  we  develop  a  timed  stochastic  diagnoser  to  calculate 
the  a  posteriori  probability  distribution  given  a  series  of 
observations  and  we  derive  conditions  for  tA-  and  tAA- 
diagnosability  based  on  the  structure  of  the  timed  stochastic 
diagnoser.  The  results  are  illustrated  with  a  basic  model  of 
stochastic  gene  expression. 

II.  System  Model  and  Preliminaries 
A.  Timed  Stochastic  Automata 

A  timed  stochastic  automaton  is  a  quadruple  G  = 
(A,  i;,r,  TTo),  where  A’  is  a  flnite  set  of  states,  E  is  a  flnite 
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set  of  events,  r  \  X  xYj  x  X  ^  is  the  rate  function, 
and  TTo  is  the  initial  probability  distribution  over  A'. 

For  a  pair  of  states  x^x'  and  an  event  a,  r{x\a  \  x) 
denotes  the  rate  at  which  the  system  transitions  to  x'  through 
the  occurrence  of  the  event  a,  given  that  the  current  state  is 
X.  For  each  state  and  event,  we  define  the  event  transition 
rate  to  be  •=  ^x'^x  ^  I  define  the  exit 

rate  of  state  x  to  be  :=  We  make  the  liveness 

assumption  that  >  0  for  all  x  G  A’. 

Let  e  denote  the  empty  string.  The  partial  transition 
function  5  associated  with  G  is  defined  recursively  to  be 

5{x^  e)  =  X 

S{x,  a)  =  {x'  G  A'  :  r(x',  a  |  x)  >  0} 

5{x^SG^  ^x' 5^)5 

where  the  last  definition  applies  for  any  5  G  S*.  Using  5, 
we  define  the  language  generated  by  the  state  x  to  be 

C{G,x)  :=  {5GS*:^(x,5)7^0}, 

where  S*  denotes  the  Kleene-closure  of  S.  The  language 
generated  by  G  is 

C{G)-.=  y  C{G,x). 

x:'Kq{x)>0 

A  finite  sample  path  uj  =  {5,  T,  r}  consists  of  a  string  5  = 
(71(72  ...  G  C{G),  an  ascending  sequence  of  arrival  times 
T  =  (U,  ^2,  •  •  • ,  and  a  duration  r  >  We  denote  the 
set  of  all  finite  sample  paths  by  ft  and  the  finite  sample  path 
of  duration  zero  by  0  =  {e,  0,  0}.  We  denote  by  {5,  r}  the 
set  of  all  finite  sample  paths  of  length  r  that  contain  the 
string  5,  regardless  of  the  sequence  of  arrival  times. 

The  concatenation  of  two  finite  sample  paths  cci  = 

and  002  =  {52,72,^2}  is  defined  to  be  ccico’2  := 
{5i52,(7i,ri  +72),ti  +T2},  where  the  notation  ri  +  72 
indicates  that  ri  is  to  be  added  to  each  element  in  the 
sequence  T2. 


B.  Observable  events  and  events  of  interest 

In  this  paper,  we  consider  deterministic  mask  functions, 
generalized  versions  of  the  projection  function  used  in  [14]. 
We  define  a  set  of  output  symbols  A  and  define  an  event 
mask  function  M  :  S  ^  (Auj^}).  The  symbol  5  denotes 
the  null  output  and  corresponds  to  no  signal  being  observed 
when  an  event  takes  place;  5  is  not  an  element  of  A.  If 
M((7)  =  5,  then  (7  is  unobservable  and  we  define  T^uo  •= 
{(7  G  S  :  M{(t)  =  e}.  All  other  events  are  observable  and 
we  define  So  :=  S  \  S^o-  U  is  possible  for  two  distinct 
observable  events  (7i,  (72  to  have  the  same  observed  output, 
i.e.  it  may  be  that  =  M{cf2). 

We  extend  the  mask  function  to  finite  sample  paths  recur¬ 
sively  by  defining 


M^{e) 

r}) 


■■=  e, 

if  cr  e  Eo 

\s  if  (T  e  s„o 

=  r}), 


where  concatenation  of  output  samples  is  defined  analo¬ 
gously  to  concatenation  of  finite  sample  paths. 

The  inverse  mask  function  M~^  is  defined  to  be 

(ju 

Mp{y)  =  {co  en  :  M^{co)  =y}  .  (1) 

We  define  a  set  of  events  of  interest  S/  C  S.  The  objective 
of  the  diagnosis  problem  is  to  determine  the  probability 
that  an  event  in  Sj  has  occurred  given  an  output  sample. 
The  objective  of  the  diagnosability  problem  is  to  determine 
conditions  under  which  we  can  ensure  that  any  occurrence 
of  an  event  of  interested  will  be  detected.  For  simplicity, 
we  will  only  consider  the  case  where  we  are  attempting  to 
diagnose  events  of  interest  of  only  one  type;  the  results  of 
this  paper  can  be  extended  to  the  situation  where  events  of 
interest  are  divided  into  multiple  types  of  interest  (following 
the  approach  described  in  [20]). 

Denote  by  ^(S/)  :=  {cc  =  {s,T,t}  :  5  =  s'f,f  G 
S/,T  =  tn}.  If  an  event  /  G  S/  is  an  element  of  a  string 
5,  we  write  that  Sj  G  5.  If  5  is  the  string  associated  with  a 
finite  sample  path  cc,  we  write  Sj  G  cc. 


C.  Defining  the  Probability  Distribution 

The  interpretation  of  the  quantity  r{x',a  \  x)  is:  given 
that  the  current  state  of  G  is  x,  the  probability  that  the 
event  cr  will  occur  and  transition  the  system  to  x'  in  the 
next  dt  seconds  is  r{x' ,cf  \  x)dt.  It  follows  from  this 
interpretation  that  the  process  is  Markovian  and  that  the 
interarrival  times  between  the  occurrences  of  events  are 
distributed  exponentially  [21,  Ch.  8]. 

Using  this  interpretation  of  the  rates  we  can  specify  the 
probability  distribution  over  the  sets  of  upward  closures 
finite  sample  paths  following  a  procedure  similar  to  [13, 
Ch.  2].  First,  we  define,  for  all  oo  e  ft,  tt^,  the  probability 
distribution  over  A’  after  the  occurrence  of  the  finite  sample 
path  UJ.  These  distributions  are  defined  recursively  to  be 

TToix)  :=  TToix), 

)  •  V— \  /  \  _ ^  f 

T.xex^UxFcT,xe 

We  define  the  probability  of  finite  sample  path  of  duration 
r  in  which  no  event  occurs  to  be 

Pr({e,T}  I  TTw)  =  (2) 

x£X 

This  probability  is  this  probability  that  the  arrival  time  of 
the  first  event  is  greater  than  r.  For  finite  sample  paths  with 
strings  of  greater  length  the  probability  distribution  can  be 
specified  recursively.  First  we  define  the  probability  of  a 
string  consisting  of  one  event  occurring  somewhere  in  the 
interval  (0,  r)  to  be 


Pr({(T,T}  I  TT^)  =  I  7Tuj{x)r{x' ,  (7  I  x) 

xexx'ex^^ 
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Fig.  1.  A  timed  stochastic  automaton  with  state  space  X  = 
{a^o,  ici,  X2, 2:3}  and  event  set  S  =  {n,  /,  a,  b}. 


Because  the  sets  (0,  r)  are  a  generating  class,  this  expression 
is  sufficient  to  define  the  probability  of  an  event  occurring 
at  a  time  in  set  in  the  Borel  a-field  on  the  nonnegative  reals. 

For  sample  paths  with  strings  containing  more  than  one 
event,  we  can  define  the  probabilities  of  such  strings  recur¬ 
sively  to  be 

Pr(cJ2CU3  I  TTc^i)  =  Pr(cU3  I  7rc^ia;2)Pr(^2  |  TTc^i),  (4) 


where  the  diagnosability  condition  function  D  :  Q  -x  {0, 1} 
is 


D{uJiUJ2) 


if  u;  G  ^  S/  G  cc 

otherwise. 


(6) 


This  definition  makes  two  assertions.  The  first  of  these  is 
that  for  any  occurrence  of  an  event  of  interest,  almost  every 
finite  sample  path  of  sufficient  length  after  the  event  will 
allow  us  to  detect  the  occurrence  of  the  event.  The  second 
of  these  it  that,  in  order  to  detect  the  occurrence  of  an  event 
of  interest,  we  must  be  completely  sure  that  there  has  been 
at  least  one  occurrence  of  the  event.  In  any  finite  amount  of 
time,  a  t^d-diagnosable  system  will  allow  the  possibility  of 
a  false  negative;  however,  in  the  long  run,  the  probability  of 
a  false  negative  must  approach  zero. 

The  second  definition  of  stochastic  diagnosability  we  pro¬ 
pose,  t^d^d-diagnosability,  is  weaker  than  M-diagnosability 
as  the  second  of  the  assertions  weakened. 

Definition  3.2:  A  timed  stochastic  automaton  is  tAA- 
diagnosable  if 


(Ve  >  0)  (Va  <  1)  (3T  >  0)  (Vcji  G  (Vt  >  T) 

Pr  (w2  ;  Dc«(wiW2)  =  0  |  A  T2  =  t)  <  e  (7) 


as  the  Markovian  property  of  the  system  implies  that  the 
distribution  of  sample  paths  after  the  duration  of  ujiuj2  is 
independent  of  the  distribution  of  sample  paths  before  that 
duration,  given  the  distribution 

Example.  Fig.  1  shows  a  small  timed  stochastic  automaton 
that  we  use  as  a  running  example.  The  state  space  of  this 
automaton  is  T'  =  {xq, xi, X2, ^3}  and  the  event  set  is  H  = 
{u^  /,  a,  b}.  The  transition  rates  between  states  are  as  shown 
in  the  figure  and  we  set  7ro(xo)  =  1.  The  set  of  unobservable 
events  is  'Euo  =  {u,f}  and  the  set  of  events  of  interest 
is  'Ey  =  {/}.  The  probability  of  any  Borel  measurable  set 
of  finite  sample  paths  can  be  calculated  from  the  transition 
structure.  For  example,  the  probability  of  the  occurrence  of 
the  string  ua  along  a  finite  sample  path  of  duration  r  is 

Pr  ({ita,T})=  [  [ 

Jo  Jti 

III.  Definitions  of  Diagnosability 

Two  definitions  of  stochastic  diagnosability  were  proposed 
in  [14]  for  stochastic  discrete-event  systems  without  timing 
information.  In  this  paper,  we  modify  those  definitions  so 
as  to  apply  to  the  case  when  the  event  arrival  times  are 
distributed  probabilistically.  The  first  of  these  definitions,  tA- 
diagnosability,  is  defined  as  follows. 

Definition  3.1:  A  timed  stochastic  automaton  is  tA- 
diagnosable  if 

(Ve  >  0)  (3T  >  0)  (Vcci  G  ^{Ef))  (Vt  >  T) 

Pr  (cc2  :  D{ujiuj2)  =  0  |  Ar2  =t)  <  e  (5) 


where  the  diagnosability  condition  function  Da  :  ^  {0?  1} 

is 

Da{iOiiU2) 

1  if  Pr(i;/  G  cc  I  cc  G  M~^{Maj{uJiUJ2)))  > 

0  otherwise. 

(8) 

In  tAA-diagnosability,  the  diagnosability  condition  function 
D  used  in  tA-diagnosability  is  replaced  by  Da’,  using  Da, 
we  no  longer  need  to  be  exactly  sure  that  a  fault  has  occurred 
in  order  to  consider  it  diagnosed  -  it  is  sufficient  that  the 
probability  of  failure  be  above  the  threshold  a.  Thus  an 
tAA-diagnosable  system  will  allow  false  positives  with  a 
probability  1  —  a.  The  definition  of  tAA-diagnosability  states 
that  for  almost  all  finite  sample  paths  of  sufficient  length,  we 
can  almost  surely  reduce  the  probability  of  false  positives 
until  it  is  eventually  reaches  zero. 

IV.  Timed  Stochastic  Diagnoser 

The  timed  stochastic  diagnoser  serves  two  main  purposes. 
The  first  is  to  calculate  the  a  posteriori  probability  distribu¬ 
tion  given  the  observation  of  a  finite  sample  path.  The  second 
is  to  allows  us  to  find  necessary  and  sufficient  conditions  for 
ta-  and  tAA-  diagnosability. 

We  first  define  the  set  of  labels  to  be  L  =  {N^Y};  the 
label  “N”  indicates  that  no  failure  event  in  Ef  has  occurred 
and  the  label  “Y”  indicates  that  there  has  been  at  least  one 
occurrence  of  an  event  in  Yj .  As  events  occur  along  a  sample 
path,  the  label  associated  with  the  system  evolves  according 
to  the  label  propagation  function  LP  :  L  x  Y*  ^  L 

LP(«,.)=|"  if  «  =  W  O"*!  S/ S' s 
Y  otherwise. 
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The  timed  stochastic  diagnoser  is  a  septuple  SD  = 

Q  C  is  the  set  of 

supports.  Each  support  g  G  Q  is  a  list  of  components, 
where  each  component  is  a  pair  (x,^)  G  A’  x  L.  A  set 
of  components  {(^i,  ^i),  (^2,  ^2),  •  •  • ,  (^n^^n)}  is  certain  if 
=  ^2  =  •••  =^n-  The  components  in  each  support  need 
to  be  placed  into  a  particular  order;  this  order  can  be  chosen 
arbitrarily.  The  event  set  of  the  timed  stochastic  diagnoser, 
is  equal  to  A,  the  output  set  of  G. 

To  specify  the  deterministic  partial  transition  function 
we  need  to  first  define  the  unobservable  reach  of  a 
state  X  G  A;  the  unobservable  reach  is  determined  by  the 
function  UR  :  X  x  L  ^  2^^^ 


UR{x,  £)  =  {{x',£')eA:xL:3se  H  £(G,  x) 

such  that  {S{x,  5),  LP{£,  5))  =  (x',  £')}  ,  (9) 

where  2^^^  is  the  power  set  of  A’  x  L.  The  unobservable 
reach  of  a  component  {x^£)  is  the  set  of  all  components  that 
are  reachable  from  {x^£)  through  the  occurrence  of  unob¬ 
servable  events.  The  deterministic  partial  transition  function 

^SD 

is  defined  to  be 


5^^{q,<j)-.=  UR\  U  {6{x,<7),LP{£,a)) 

\ix,i)eq 

The  initial  support  is 

qo=  U  UR{x,N). 

x^P(^:7To{x)>0 

The  above  four  elements  define  the  “logical”  structure  of 
the  timed  stochastic  diagnoser  and  are  equivalent  to  the 
diagnoser  described  in  [20],  [22]. 

We  append  three  elements  to  this  logical  diagnoser  struc¬ 
ture.  To  each  transition  between  support  in  the  timed  stochas¬ 
tic  diagnoser,  we  append  a  discrete-update  matrix  a). 

We  define  each  element  in  this  matrix  to  be 

r{xj^  a  \  Xi)  if  LP{£i,  a)  =  £j,  Xi  7^  Xj 

0  otherwise. 


Each  element  in  a)  is  the  rate  at  which  the  observable 

event  cr  fires  from  a  given  component  in  q  transitions  to 
a  given  component  in  ||<f>'^(g,  cr)||.  The  size  of  the  matrix 
1 1  1 1  is  the  number  of  components  in  by 

the  number  of  components  in  g;  each  element  of  a  matrix  in 
is  nonnegative. 

Similarly,  for  each  support  in  the  timed  stochastic  diag¬ 
noser,  we  append  a  continuous-update  matrix  <f>^(g)  which 
is  defined  element-wise  to  be 


^ae^uo--LP{£i,a)=£j  5  ^  I  ^0  ^  7^  2 

-Tx  i  =  j 


size  equal  to  the  number  of  components  in  g;  all  off-diagonal 
elements  of  ||<l>^(g)||  are  nonnegative  and,  as  a  result  of  the 
liveness  assumption,  all  diagonal  elements  are  negative.  The 
sum  of  any  column  of  any  matrix  in  is  nonpositive. 

The  initial  probability  distribution  vector  has  a  length 
equal  to  the  number  of  components  in  go.  If  we  enumerate 
the  components  of  go  as  ci ,  C2 , . . . ,  and  express  each 
component  as  q  =  {xi^£i),  then  we  define  0o  element  by 
element  as  0o,i  =  Po{xi)  if  £i  =  N  and  zero  if  £i  =  Y. 

As  stated  as  the  beginning  of  this  section,  the  timed 
stochastic  diagnoser  can  be  used  to  calculate  the  a  posteriori 
probability  distribution  on  A  x  L.  Given  cCq,  a  finite  output 
sample  path  of  duration  t,  we  denote  by  0t('  |  ^o)  the  prob¬ 
ability  distribution  vector  generated  by  the  timed  stochastic 
diagnoser  after  observing  cCq.  This  vector  is  updated  in 
accordance  with  the  following  theorem. 

Theorem  4.1:  The  a  posteriori  probability  distribution 
(j)t{'  I  00 o)  Updates  recursively  according  to  the  following 
equations 

4>T+t{-  I  I  Ojo)  (10) 

Ac 

0T+t(-  I  iOo{(J,t,t})  =  I  Wo), 

Rd 

(11) 

where  q'  is  the  support  reached  by  the  timed  stochastic 
diagnoser  following  the  occurrence  of  the  observation  path 
uOo  and  Kc  and  Kd  are  normalization  constants  that  ensure 
that  the  probability  distribution  vector  sums  to  1. 

Proof:  We  prove  the  correctness  of  the  continuous- 
update  equation  Eq.  10.  The  proof  of  correctness  for  the 
discrete-update  equation  Eq.  11  is  analogous  to  Theorem  1 
of  [14]  and  is  omitted  for  space. 

Suppose  the  timed  stochastic  diagnoser  is  in  support  q'  at 
time  T,  and  let  c(T  +  t)  denote  the  component  that  describes 
the  true  state  of  the  timed  stochastic  automaton  at  time  T  +  t. 
Then 

Pr(c(r  +  t)  =  {x,£)  I  Wo{e,  0,t}) 

=  ^  Pv{c{T  +  t)  =  {xX) 

{x',£')eq' 

/\c{T)  =  {x',£')  I  Wo{e,  0,i}) 
Let  Kc  =  Pr  ({e,  0,t}  |  ujo).  It  follows  that 

Pr  (c(T  +  t)  =  {x,  £)  I  ujo{e,  0,  t}) 

=  A  Pr  (c(T  +  t)  =  {x,  £)A 

^  (x',£')eq' 

c{T)  =  {x\£')A{e,9U}\oOo). 

Conditioning  on  c(T)  yields 


Each  non-diagonal  element  in  T>^(g)  is  the  rate  at  which 
unobservable  events  fire  from  a  given  component  in  g  and 
transition  the  system  to  another  given  component  in  g.  Each 
diagonal  element  is  the  exit  rate  at  which  all  events  fire  from 
a  given  component.  The  matrix  (g)  is  a  square  matrix  with 


Pr  (c(T  +  t)  =  {x,£)  I  uJo{s,9U}) 

=  f-  Y  Pr(c(r  +  f)  =  (a7,f)  A{e,0,f} 

^  {x',£')eq' 

I  c{T)  =  {x',£')  A  uJo)  Pr  (c(T)  =  {x',£')  \  cvo)  ■ 
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Fig.  2.  The  timed  stochastic  diagnoser  of  the  timed  stochastic  automaton  shown  in  Figure  1.  The  recurrent  components  in  this  Markov  chain  are  (xi,  Y) 
in  support  qi  and  (xs,  N)  in  support  q2.  For  clarity  of  presentation,  we  write  the  component  (xi,  £)  as  i  i. 


Order  the  components  in  q'  as  ci , . . . ,  arbitrarily.  Then 
we  can  re-write  the  above  in  vector  form,  yielding 


0T+t('  I  UJo{s,9,t})  =  A{t) 


Pr  (c(T)  =  Cl  I  ujo) 


Pr  (c(T)  =  Cn  I  UJo)_ 


where  Aiy(t)  =  Pr  (c(t  +  T)  =  c^  A  {c,  0,  f}  |  c(T)  =  cy). 
A(t)  is  a  constituent  part  of  the  transition  semigroup  of 
a  continuous-time  Markov  process  constructed  from  the 
components  ci , . . . ,  c^  and  a  dump  state  whose  infinitesimal 
generator  is 


corresponds  to  the  known  initial  component  (xq^N).  The 
components  of  the  matrix  ^^(go),  shown  inside  support  go, 
correspond  to  transition  rates  in  Figure  1;  for  example,  the 
^2  1  (^o)  =  -3,  the  transition  rate  from  xq  to  xi  in  the  original 
timed  stochastic  automaton.  The  other  entries  in  and 
are  similarly  derived. 

Suppose  that  we  observe  the  finite  sample  path  cjo  = 
{aa,{l,2},3}.  Then  the  a  posteriori  probability  distribution 
03(-  I  cJo)  is  given  by 


03 (•  I  ^o) 


K 


exp 


0 

-2.1 


(3-2) 


1  0 
0  2 


O' 

( 

■-1  0 

\ 

"0  1  o' 

-lT^Q(q')  0 

X  exp  I 

0  -2.1 

(2  -  1)  j 

0  0  2 

Taking  the  matrix  exponential  of  this  infinitesimal  generator 
yields 


exp 


(\ 


A(t)  O' 
1  -  A(t)  1 


(12) 


which  follows  from  the  fact  that  the  only  paths  from  Ci 
to  Cj  in  this  process  are  those  along  which  no  observable 
event  occurs,  and  thus  the  (i,  j)th  element  of  the  transition 
semigroup  is  Pr  (c(t  +  T)  =  c^  A  {c,  0,  t}  \  c{T)  =  Cj).  It 
immediately  follows  that  A(t)  =  and  thus  that 

I  Wo{e,0,i})  =  |  ojo).  (13) 

-A-c 


If  no  event  is  observed  along  an  interval  of  duration 
t,  then  the  evolution  of  the  probability  distribution  vector 
is  governed  by  the  exponential  of  the  continuous-update 
matrix  T>^(gQ),  as  the  support  of  the  distribution  does  not 
change  if  no  events  are  observed.  If  an  event  is  observed 
after  a  duration  t,  a  discrete  update  described  by  the  matrix 
^^(gQ,cr)  occurs  as  the  probability  distribution  vector  is 
transitioned  to  a  new  support.  The  elements  of  ^^(go,cr) 
re- weight  the  components  of  the  new  support  in  accordance 
with  the  event  that  was  observed. 

Example.  The  timed  stochastic  diagnoser  SD  associated 
with  the  timed  stochastic  automaton  shown  in  Figure  1  is 
shown  in  Figure  2.  The  initial  support  go  contains  the  initial 
state  Xq  appended  with  the  label  N  and  its  unobservable 
reach,  the  components  (xi^Y)  (reached  by  an  occurrence  of 
/)  and  (x2,  N)  (reached  by  an  occurrence  of  u).  The  initial 
probability  vector  is  0o  =  [l  0  O]  as  the  first  component 


—  .6 

0 

0 

\ 

.3 

-1 

0 

1 

0 

_  -3 

0 

-2.1 

) 

0 

Solving  this  expression  and  choosing  K  so  as  to  normalize 
the  vector  yields  03(-  |  ujo)  =  [-782  .218].  Thus  the 
probability  of  the  system  being  in  state  xi  and  the  event 
of  interest  having  occurred  is  78.2%  and  the  probability  of 
the  system  being  in  state  X2  and  the  event  of  interest  not 
having  occurred  in  21.8%. 


V.  Conditions  for  Diagnosability 


In  this  section  we  derive  conditions  for  tA-  and  tAA- 
diagnosability  from  the  structure  of  the  timed  stochastic 
diagnoser. 

For  each  pair  of  states  g^,  gy  G  Q,  let 


^  :6^ ^  (qi  ,a)=qj 

(14) 

where  1  is  an  indicator  function.  We  use  these  fl-matrices 
to  construct  the  matrix 

"fl(gi,gi)  •••  fl(gi,gn)' 


Q{SD)  = 


yt{qn,qi) 


')  Qn\ 


(15) 


The  matrix  Q{SD)  can  be  used  to  construct  a  Markov  chain 
whose  states  are  the  components  of  each  support  in  SD 
according  to  the  following  lemma. 

Lemma  5.1:  Q{SD)  is  the  infinitesimal  generator  for  a 
continuous-time  Markov  chain,  i.e.,  its  columns  sum  to  zero 
and  only  its  diagonal  elements  are  negative. 
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Proof:  Suppose  the  ith  column  of  Q{SD)  corresponds 
to  a  component  {x,i)  is  the  timed  stochastic  diagnoser  state 
q.  Then 

QiiiSD)  =  -r^  +  ^  r{x,a  \  x),  (16) 

aeE:LP{e,a)=e 

which  is  nonpositive  since  all  terms  in  the  summation  are 
also  terms  in  the  summation  that  defines  .  All  nondiagonal 
elements  of  Q{SD)  must  be  nonnegative  because  they  are 
defined  as  the  sums  of  transition  rates,  which  are  always  non¬ 
negative.  By  construction,  the  sum  of  a  column  in  Q{SD) 
is 


Eq.  ,{SD)  =  -r,+ 

j  x' ctGS 


which  equals  zero  by  the  definition  of  ■ 

Because  the  components  in  a  timed  stochastic  diagnoser 
are  states  of  a  finite  state  Markov  chain,  they  can  be  classified 
as  either  recurrent  or  transient  from  the  structure  of  Q{SD). 
We  know  that,  in  the  long  run,  the  probability  that  the 
timed  stochastic  diagnoser  will  reach  a  recurrent  component 
approaches  one.  Thus  by  analyzing  the  properties  of  the 
recurrent  components,  we  can  derive  conditions  for  tA-  and 
tAA-diagnosability,  as  these  conditions  describe  the  long¬ 
term  behavior  of  the  system. 

Theorem  5.1:  A  stochastic  automaton  G  is  tA- 
diagnosable  if  and  only  if  every  support  of  its  timed 
stochastic  diagnoser  containing  a  recurrent  component 
bearing  the  label  Y  is  certain. 

Theorem  5.2:  A  stochastic  automaton  G  is  tAA- 
diagnosable  if,  in  every  support  of  its  timed  stochastic 
diagnoser,  the  set  of  recurrent  components  within  the 
support  is  certain. 

The  proofs  of  these  theorems  are  analogous  to  the  proofs  in 
[14]  and  have  been  omitted  for  space. 

Example.  For  the  timed  stochastic  diagnoser  shown  in 
Fig.  2,  the  associated  continuous-time  Markov  chain  has  the 
infinitesimal  generator  (2l{SD)  shown  below: 


{qo,xo,N) 

{qo,Xi,Y) 

{qo,X2,N) 

(qi^xi.Y) 

{qi,X2,N) 

{q2,X3,N) 


-.6 

.3 

.3 

0 

0 

0 


0 

-1 

0 

1 

0 

0 


0  0  0  0 

0  0  0  0 

-2.1  0  0  0 

0  0  0  0 

2  0  -.1  0 

.1  0  .1  0 


By  inspection,  the  recurrent  components  in  this  Markov 
chain  are  (xi,E)  in  support  qi  and  (xs^N)  in  support  q2. 
It  follows  that  the  timed  stochastic  automaton  is  not  tA- 
diagnosable  because  the  support  qi  is  not  certain.  However, 
the  timed  stochastic  automaton  is  tAA-diagnosable  because 
the  set  of  recurrent  components  within  each  support  is 
certain. 


VI.  Stochastic  Gene  Expression 

A.  Modeling  Chemical  Reaction  Networks  as  Timed  Stochas¬ 
tic  Automata 

Consider  a  simple  model  of  stochastic  gene  expression 
[23]  consisting  of  two  species,  messenger  RNA  (mR)  and  a 


fluorescent  protein  (P).  The  model  consists  of  four  chemical 
reactions: 


i?i  : 

0^ 

>  mR 

i?_i  : 

mR 

^0 

i?2  : 

mR 

%mR  +  P 

R-2. 

A0. 

In  keeping  with  the  standard  formulation  of  stochastic  chemi¬ 
cal  kinetics  [6],  we  construct  a  timed  stochastic  automaton  as 
follows.  We  initially  define  the  state  space  A’  as  No  x No,  each 
state  being  a  vector  [umR^np]  containing  the  populations  of 
both  of  the  species  in  the  reaction  network.  To  ensure  that 
the  state  space  of  the  timed  stochastic  automaton  is  finite, 
we  cap  the  population  of  mRNA  at  15  molecules  and  the 
population  of  protein  at  2000  molecules. 

Assume  that  P  is  fiuorescent  and  thus  that  its  population 
of  P  is  observable.  Since  the  firing  of  reactions  R2  and 
R-2  changes  the  protein  population,  these  events  are  also 
observable.  The  population  of  mR  is  unobservable  and  thus 
the  firings  of  Ri  and  R-i,  which  only  change  the  mRNA 
population,  must  be  unobservable  events. 

We  wish  to  determine  if  the  population  of  mR  ever  exceeds 
9  molecules.  In  order  to  do  thus,  we  separate  the  firings 
of  Ri,  the  reaction  the  increases  the  mR  population,  into 
two  events:  “special”  R\,  which  increases  the  mR  population 
from  8  to  9,  and  “normal”  Ri,  which  fires  from  all  other  mR 
population  levels.  We  thus  define  the  event  set  of  the  timed 
stochastic  automaton  to  be  E  =  {i?i, i?_i, i?2, ^-2}, 
the  observable  events  to  be  E^  =  {i?2,  ^-2},  and  the  events 
of  interest  to  be  Ej  = 

We  specify  the  transition  rates  between  states  as  a  result 
of  the  firing  of  Ri  to  be 

r{[nmR  +  l,np],i?i,  [nmR,np])  =  ki  if  8 

r([9,np],i?i,  [8,np])  =  fci, 


where  the  reaction  RI  is  the  event  of  interest.  For  the  other 
reactions,  we  set 

r{[nmR  +  [n^p,np])  =  k-iUmR 

r{[nmR,np  +  =  k2nmR 

r{[nmR,np  —  l],i?_2,  [rimR.np])  =  k-2np. 


We  specify  the  initial  distribution  over  A’  through  the 
marginal  distributions  over  the  species;  we  set  7ro(np  =  0)  = 
1  and  we  set  the  mR  population  to  be  a  Poisson  distribution 
with  parameter  A  =  5. 

B.  Diagnosability  Results 

Constructing  the  full  timed  stochastic  diagnoser  for  this 
system  results  in  2001  separate  supports,  one  for  each 
possible  population  of  protein.  A  typical  support  of  the  timed 
stochastic  diagnoser,  with  the  protein  population  set  to  n,  and 
the  transitions  out  of  that  support,  is  shown  in  Fig.  3.  The 
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Fig.  3.  A  typical  support  of  the  timed  stochastic  diagnoser  for  gene  expression.  The  reactions  R2  and  R-2  transition  the  diagnoser  from  the  support 
where  the  protein  number  is  n  to  the  supports  where  the  protein  number  is  n  +  1  and  n  —  1,  respectively.  For  simplicity  we  write  =  ki-\-  mk-i  + 

mk2  +  nk—2. 


Protein  number  (Observed  variable)  Continuous  Diagnoser  Dynamics 


(a) 


mRNA  number  (Hidden  variable) 


(b) 

Fig.  4.  A  simulated  trajectory  of  the  stochastic  gene  expression  reaction 
network,  (a)  The  event  R2  changes  the  observed  output  of  the  reaction 
network,  the  fluorescent  protein  population,  (b)  The  events  Ri,  and 
R-i  change  the  unobserved  state  variable,  the  mRNA  population. 


recurrent  components  of  the  timed  stochastic  diagnoser  are 
all  components  with  the  “Y”  label,  because  every  state  in  the 
original  timed  stochastic  automaton  is  reachable  from  every 
other  state.  As  a  result,  the  system  is  not  tA-diagnosable 
because  there  exist  recurrent  components  that  do  not  lie  in 
certain  supports.  However,  the  system  is  tAA-diagnosable 
because  the  set  of  recurrent  components  in  each  support  is 
certain. 

To  illustrate  the  online  operation  of  the  timed  stochastic 
diagnoser,  we  generate  a  finite  sample  path  of  duration  t  = 
[0, 1440]  using  Gillespie’s  stochastic  simulation  algorithm 
[7].  The  parameter  values  are  chosen  as  ki  =  .0554  mRNA/s, 
k-i  =  .0113  (mRNA.s)“^,  k2  =  .17  protein/(mRNA.s), 


Fig.  5.  The  dynamics  of  the  timed  stochastic  diagnoser.  The  trajectory  from 
Figure  4  is  the  input.  The  probability  that  the  event  of  interest  occurred  is 
plotted  against  time. 


and  k-2  =  0  [24].  The  protein  decay  rate  was  chosen  to 
be  zero  for  simplicity  because  the  diagnoser  performance  is 
independent  of  k-2-  The  generated  trajectory  is  shown  in 
Figure  4. 

By  inspecting  the  evolution  of  the  mRNA  population,  we 
can  conclude  that  an  event  of  interest  first  occurred  when 
the  mRNA  number  is  first  equal  to  9,  which  occurs  at 
approximately  t  =  620  seconds. 

The  evolution  of  the  diagnoser  output  over  time  is  shown 
in  Figure  5.  In  between  increases  in  the  protein  number, 
the  probability  that  the  event  of  interest  occurred  decreases 
because  the  states  with  the  highest  mRNA  populations  (i.e. 
the  states  where  protein  production  is  most  likely)  are  only 
reachable  after  an  event  of  interest.  It  is  less  likely  that 
the  protein  level  remains  constant  in  these  states  and  thus 
the  a  posteriori  probability  of  the  event  of  interest  having 
occurred  decreases.  Similarly,  when  an  increase  in  the  protein 
population  is  observed,  the  probability  of  the  event  of  interest 
having  occurred  increases.  Notice  that  the  largest  jumps  in 
the  a  posteriori  probability  of  the  event  of  interest  having 
occurred  correspond  to  the  fastest  increases  in  the  protein 
population.  The  first  large  increase  actually  occurs  before 
the  event  of  interest  does;  by  inspecting  the  trajectory,  we 
can  see  that  the  rate  at  which  protein  is  being  produced  when 
this  increase  occurs  is  very  high. 
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VII.  Discussion 

In  this  paper  we  investigate  the  problems  of  detecting 
events  of  interest  in  stochastic  chemical  kinetic  systems  using 
the  formalism  of  discrete-event  systems.  We  define  a  class  of 
state  machines,  timed  stochastic  automata,  that  is  appropriate 
for  modeling  stochastic  chemical  kinetics  and  define  condi¬ 
tions  for  diagnosability  appropriate  for  this  class  of  system. 
We  then  develop  the  procedure  for  constructing  a  timed 
stochastic  diagnoser  that  can  be  used  to  give  online  updates 
as  to  the  probability  that  an  event  of  interest  has  occurred 
and  offline  conditions  for  timed  stochastic  diagnosability. 

The  performance  of  the  timed  stochastic  diagnoser  de¬ 
pends  on  the  accuracy  of  the  parameters  of  the  original 
chemical  kinetic  model;  as  these  parameters  can  never  be 
known  exactly,  in  future  work  it  is  important  to  develop 
notions  of  robust  diagnosability  for  uncertain  discrete-event 
systems.  For  large  models  containing  many  distinct  reactions 
and  species,  there  is  an  issue  of  scalability  as  the  size  of  the 
timed  stochastic  diagnoser  grows  to  an  intractable  size  (note 
that,  for  the  stochastic  gene  expression  example,  a  system 
with  4  reactions  produced  a  timed  stochastic  diagnoser  with 
2001  supports).  Investigation  of  more  efficient  algorithms  for 
online  probability  updates  and  offline  testing  of  diagnosabil¬ 
ity  conditions  is  ongoing. 
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